Abstract



The experimental problem of converting a measured binomial quantity, 
the fraction of events in a sample that pass a cut, into a physical binomial 
quantity, the fraction of events originating from a signal source, is described 
as a system of linear equations. This linear system illustrates several familiar 
aspects of experimental data analysis. Bayesian probability theory is used 
to find a solution to this binomial measurement problem that allows for the 
straightforward construction of confidence intervals. This solution is also 
shown to provide an unbiased formalism for evaluating the behavior of data 
sets under different choices of cuts, including a cut designed to increase the 
significance of a possible, albeit previously unseen, signal. 

Several examples are used to illustrate the features of this method, including
the discovery of the top quark and searches for new particles produced in
association with bosons. It is also demonstrated how to use this method
to make projections for the potential discovery of a Standard Model Higgs
boson at a Tevatron Run 2 experiment, as well as the utility of measuring the
integrated luminosity through inclusive $p\bar{p} \to$ production.

I. INTRODUCTION



A set of experimental data is almost always presented in terms of the subset of interesting 
events, called the signal, and a complementary subset of events from non-interesting sources,
called the background. The fraction of events from each subset is a binomial quantity; some 
fraction of the sample is either characterized as signal or it is not. There is usually no exact 
means of separating signal events from background events. Instead an experimental cut is 
imposed on the original sample. Such a cut is motivated by independent studies that imply 
the cut will be more efficient for the events of interest than for the non-interesting events. 
After this cut is applied, estimates are made regarding the amount of signal and background 
in these new binomial subsets: those events that survived the cut, and those that failed the 
cut. 

The attempt to rotate data from the experimental axis in pass-fail space onto the physics 
axis that defines signal-background space is referred to in this paper as the measurement 
problem. The measurement problem is introduced and described in Section II. The classical
treatment of the measurement problem as a system of linear equations provides some
insight into the practical business of analyzing data, but it is found to be inadequate for
the construction of confidence intervals. A Bayesian analysis of related binomial quantities
provides a straightforward solution to this problem. Bayesian descriptions of binomial data
are given in Section III. A full solution to the measurement problem is found in Section IV.

The solution of Section IV provides a result in terms of the fraction of signal events in
the entire original sample; Section V reformulates this solution so that the result can be
presented as a fraction of signal events in the subset of events that passed the cut. The
methods introduced in Sections IV and V are demonstrated in Example 1 with the data
used for the discovery of the top quark. Example 2 describes one way to use this method
to estimate the necessary size of control samples in order to understand the background to
inclusive $p\bar{p} \to$ production.

Section VI presents a formalism for calculating the minimum number of events which
must survive a cut designed to enhance the significance of a possible signal over expectations.
Section VII describes how to use the measurement problem to attribute a level of
confidence in the consistency of a possible new discovery with the original understanding of
the expected backgrounds. Examples 3 and 4 illustrate the methods of Sections VI and VII
using published $p\bar{p} \to$ + $b\bar{b}$ results. Example 5 extrapolates these Tevatron Run 1 data
to the estimated amount of data available for a similar analysis in Run 2.

II. THE MEASUREMENT PROBLEM: A LINEAR SYSTEM FOR DATA ANALYSIS



The measurement problem is equivalent to taking data that are recorded on an experimental
axis, i.e. pass-fail space, and rotating the experimental results onto a physics axis,
i.e. signal-background space. When a cut is imposed on a data sample of $N_{Total}$ events, the
sample is divided into a subset of events which pass the cut, $N_{pass}$, and a subset of
events which fail the cut, $N_{fail}$,

$$ N_{Total} = N_{pass} + N_{fail} . \tag{1} $$

The original sample can also be described as a subset of signal events $N_{sig}$ and a subset of
background events $N_{bkg}$,

$$ N_{Total} = N_{sig} + N_{bkg} . \tag{2} $$

The different axes are related through a measurement matrix $M$,

$$ \begin{pmatrix} N_{pass} \\ N_{fail} \end{pmatrix} = M \begin{pmatrix} N_{sig} \\ N_{bkg} \end{pmatrix} . \tag{3} $$

The measurement problem is to invert the matrix $M$ such that

$$ \begin{pmatrix} N_{sig} \\ N_{bkg} \end{pmatrix} = M^{-1} \begin{pmatrix} N_{pass} \\ N_{fail} \end{pmatrix} , \tag{4} $$

where the elements of the measurement matrix are the efficiencies of the cut on the signal
and the background,

$$ M = \begin{pmatrix} \epsilon & r \\ 1-\epsilon & 1-r \end{pmatrix} . \tag{5} $$

The efficiency $\epsilon$ of the cut on the signal is defined as the number of signal events that
will pass the cut, $S_{pass}$, divided by the total number of signal events in the original sample;
the number of signal events which will fail the cut, $S_{fail}$, is the total number of signal events
times the inefficiency $(1 - \epsilon)$:

$$ S_{pass} = \epsilon \, N_{sig} , \tag{6a} $$
$$ S_{fail} = (1 - \epsilon) \, N_{sig} . \tag{6b} $$

The efficiency $\epsilon$ is always evaluated from some independent control sample of $\epsilon_{Total}$
diagnostic events, where $\epsilon_{pass}$ ($\epsilon_{fail}$) diagnostic events pass (fail) the cut;

$$ \epsilon = \epsilon_{pass} / \epsilon_{Total} , \tag{7a} $$
$$ 1 - \epsilon = \epsilon_{fail} / \epsilon_{Total} . \tag{7b} $$

Similarly the efficiency of the cut on the background, $r$, referred to as the 'rfficiency', is
defined as the number of background events that will pass the cut, $B_{pass}$, divided by the total
number of background events in the original sample, while the number of background events
which will fail the cut, $B_{fail}$, is the total number of background events times the 'inrfficiency'
$(1 - r)$:

$$ B_{pass} = r \, N_{bkg} , \tag{8a} $$
$$ B_{fail} = (1 - r) \, N_{bkg} . \tag{8b} $$

Just as the efficiency is evaluated from an independent diagnostic sample, the rfficiency
comes from another independent sample of $r_{Total}$ events where $r_{pass}$ ($r_{fail}$) diagnostic events
pass (fail) the cut;

$$ r = r_{pass} / r_{Total} , \tag{9a} $$
$$ 1 - r = r_{fail} / r_{Total} . \tag{9b} $$

It is common to refer to the rejection factor $R$ of a cut as the ratio of the different
efficiencies,

$$ R = \frac{\epsilon}{r} , \tag{10} $$

while the enhancement $E$ of a cut on a given sample can be defined as

$$ E = \frac{\epsilon - r}{r} = R - 1 . \tag{11} $$

The inverse measurement matrix $M^{-1}$,

$$ M^{-1} = \frac{1}{\epsilon - r} \begin{pmatrix} 1-r & -r \\ \epsilon-1 & \epsilon \end{pmatrix} , \tag{12} $$

exists only if the determinant of the matrix $M$ is not equal to zero, which is true whenever
$\epsilon \neq r$. This requirement is naturally satisfied whenever the rejection factor is not equal to
one or the enhancement is non-zero. Usually a cut is chosen such that

$$ 0 < r < \epsilon < 1 . \tag{13} $$

There is nothing in this formalism which prevents the choice of a cut such that $\epsilon < r$; this
is the situation where the background is enhanced at the expense of the signal.

Once the inverse measurement matrix is known, it is possible to describe the number of
signal (or background) events in terms of the number of events which pass (or fail) the cut:

$$ N_{sig} = \frac{(1-r)\,N_{pass} - r\,N_{fail}}{\epsilon - r}
           = \frac{N_{pass} - r\,N_{Total}}{\epsilon - r} , \tag{14a} $$

$$ N_{bkg} = \frac{(\epsilon-1)\,N_{pass} + \epsilon\,N_{fail}}{\epsilon - r}
           = \frac{\epsilon\,N_{Total} - N_{pass}}{\epsilon - r}
           = \frac{N_{fail} - (1-\epsilon)\,N_{Total}}{\epsilon - r} . \tag{14b} $$

In fractional terms, defining

$$ f_{sig} = N_{sig} / N_{Total} , \tag{15a} $$
$$ f_{bkg} = N_{bkg} / N_{Total} , \tag{15b} $$
$$ f_{pass} = N_{pass} / N_{Total} , \tag{15c} $$
$$ f_{fail} = N_{fail} / N_{Total} , \tag{15d} $$

then

$$ f_{sig} = \frac{f_{pass} - r}{\epsilon - r} , \tag{16a} $$
$$ f_{bkg} = \frac{f_{fail} - (1 - \epsilon)}{\epsilon - r} , \tag{16b} $$

or

$$ f_{pass} = (\epsilon - r)\, f_{sig} + r , \tag{17a} $$
$$ f_{fail} = (\epsilon - r)\, f_{bkg} + (1 - \epsilon) . \tag{17b} $$

The fraction of events which pass the cut, $f_{pass}$, will always be found in the interval
$0 \le f_{pass} \le 1$, and it is natural to restrict the fraction of signal events in the total sample,
$f_{sig}$, to the same interval. Practically this means that a physical solution to the measurement
problem exists only if $r \le f_{pass} \le \epsilon$. If the fraction of events that pass the cut is greater than
the efficiency or less than the rfficiency, then the estimates of $\epsilon$ and $r$ need to be reevaluated,
as they are almost certainly incorrect.
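
The rotation from pass-fail space to signal-background space can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the numerical sample (30 signal and 70 background events, an efficiency of 0.8 and an rfficiency of 0.2) are hypothetical and do not come from any analysis discussed in this paper.

```python
def rotate_to_physics(n_pass, n_fail, eff, rff):
    """Apply the inverse measurement matrix to (n_pass, n_fail).

    eff is the efficiency (epsilon) and rff the rfficiency (r) of the cut.
    Returns (n_sig, n_bkg); the matrix is singular when eff == rff.
    """
    det = eff - rff  # determinant of the measurement matrix
    if det == 0:
        raise ValueError("efficiency equals rfficiency: matrix is singular")
    n_sig = ((1.0 - rff) * n_pass - rff * n_fail) / det
    n_bkg = ((eff - 1.0) * n_pass + eff * n_fail) / det
    return n_sig, n_bkg


def signal_fraction(f_pass, eff, rff):
    """Signal fraction from the pass fraction; physical only if in [0, 1]."""
    return (f_pass - rff) / (eff - rff)


# Hypothetical sample: 30 signal and 70 background events.
n_pass = 0.8 * 30 + 0.2 * 70   # events expected to pass the cut
n_fail = 0.2 * 30 + 0.8 * 70   # events expected to fail the cut
n_sig, n_bkg = rotate_to_physics(n_pass, n_fail, 0.8, 0.2)
```

A pass fraction outside the interval between the rfficiency and the efficiency returns a signal fraction outside [0, 1], which is the signal that the estimates of the two efficiencies should be revisited.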

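It is possible that there is more than one background present in the original sample, and
that a cut has different rfficiencies for different backgrounds, e.g.,

$$ \begin{pmatrix} N_{pass} \\ N_{fail} \end{pmatrix} =
   \begin{pmatrix} \epsilon & r_1 & r_2 \\ 1-\epsilon & 1-r_1 & 1-r_2 \end{pmatrix}
   \begin{pmatrix} N_{sig} \\ N_{bkg,1} \\ N_{bkg,2} \end{pmatrix} . \tag{18} $$

Such problems can always be reduced to the form

$$ \begin{pmatrix} N_{pass} \\ N_{fail} \end{pmatrix} =
   \begin{pmatrix} \epsilon & r' \\ 1-\epsilon & 1-r' \end{pmatrix}
   \begin{pmatrix} N_{sig} \\ \sum_i N_{bkg,i} \end{pmatrix} , \tag{19} $$

where the total rfficiency $r'$ is the weighted sum of the individual rfficiencies,

$$ r' = \sum_{i=1}^{n} f_i \, r_i . \tag{20} $$

The weights $f_i$ are the fractional amounts of the total background due to the individual
backgrounds,

$$ f_i = \frac{N_{bkg,i}}{\sum_{j=1}^{n} N_{bkg,j}} . \tag{21} $$

This allows all problems to be reduced to the case of one signal source and one non-signal
source, i.e. one background.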
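
The reduction of several backgrounds to a single effective background can be checked numerically. The following sketch assumes a hypothetical sample with two backgrounds; the function name and all numbers are illustrative only.

```python
def total_rfficiency(rffs, n_bkgs):
    """Weighted sum of the individual rfficiencies.

    Each weight is the fraction of the total background that is due to
    the corresponding individual background.
    """
    total = sum(n_bkgs)
    weights = [n / total for n in n_bkgs]
    return sum(w * r for w, r in zip(weights, rffs))


# Hypothetical sample: one signal source and two backgrounds.
eff = 0.8
rffs = [0.1, 0.3]
n_sig = 20.0
n_bkgs = [60.0, 20.0]

# Passing events computed background by background ...
n_pass_full = eff * n_sig + sum(r * n for r, n in zip(rffs, n_bkgs))

# ... and with the single effective rfficiency of the reduced problem.
r_prime = total_rfficiency(rffs, n_bkgs)
n_pass_reduced = eff * n_sig + r_prime * sum(n_bkgs)
```

Both ways of counting the passing events agree, which is all that the reduction to one background requires.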

The solution of the measurement problem can be approached from a purely algebraic
viewpoint. If $\mathbf{e}$ is a vector representing the experimental basis, with pass and fail axes,
and $\mathbf{p}$ is the vector representing the physics basis, with signal and background axes, the
measurement problem is written $\mathbf{e} = M \mathbf{p}$. When there are uncertainties in either of the
basis vectors, or in the measurement matrix, the measurement problem is written

$$ (\mathbf{e} + \delta\mathbf{e}) = (M + \delta M)\,(\mathbf{p} + \delta\mathbf{p}) . \tag{22} $$

Finding the solution $\mathbf{p}$ with uncertainty $\delta\mathbf{p}$ is a classic problem in linear algebra. The
uncertainty $\delta\mathbf{p}$ is known [1] to be limited:

$$ \frac{\|\delta\mathbf{p}\|}{\|\mathbf{p}\|} \le
   \frac{\gamma(M)}{1 - \gamma(M)\,\|\delta M\| / \|M\|}
   \left( \frac{\|\delta M\|}{\|M\|} + \frac{\|\delta\mathbf{e}\|}{\|\mathbf{e}\|} \right) , \tag{23} $$

where $\|\mathbf{p}\|$ denotes the norm of a vector (or matrix) and $\gamma(M)$ is a non-negative real scalar
known as the condition number of the measurement matrix $M$,

$$ \gamma(M) = \|M\| \, \|M^{-1}\| . \tag{24} $$

If $\gamma(M)$ is large, the measurement problem is said to be ill-conditioned. The condition
number is the same for both the maximum absolute column sum (the 1-norm) and
the maximum absolute row sum (the $\infty$-norm) of the measurement matrix, subject to the
constraints of Equation 13:

$$ \gamma(M) = \begin{cases}
   (2 - (\epsilon + r)) / (\epsilon - r) & \text{when } (\epsilon + r) < 1 , \\
   (\epsilon + r) / (\epsilon - r) & \text{when } (\epsilon + r) > 1 .
   \end{cases} \tag{25} $$

The only way to avoid a measurement matrix with a large condition number is to avoid
$\epsilon \approx r$. In other words, large rejection factors lead to better conditioned measurement
problems; better conditioned measurement problems lead to a smaller uncertainty $\delta\mathbf{p}$ in the
quantities on the physics axis $\mathbf{p}$.
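
The closed form for the condition number can be cross-checked against a direct evaluation of the product of the 1-norms of the measurement matrix and its inverse. The sketch below assumes that the rfficiency is smaller than the efficiency and that both lie strictly between zero and one; the function names are illustrative.

```python
def condition_number(eff, rff):
    """Closed form: (2 - (eff + rff)) / (eff - rff) when eff + rff < 1,
    and (eff + rff) / (eff - rff) otherwise.  Assumes 0 < rff < eff < 1."""
    s = eff + rff
    if s < 1.0:
        return (2.0 - s) / (eff - rff)
    return s / (eff - rff)


def condition_number_1norm(eff, rff):
    """Product of the maximum absolute column sums of M and of M^-1."""
    det = eff - rff
    m = [[eff, rff], [1.0 - eff, 1.0 - rff]]
    m_inv = [[(1.0 - rff) / det, -rff / det],
             [(eff - 1.0) / det, eff / det]]

    def norm1(a):
        return max(abs(a[0][j]) + abs(a[1][j]) for j in range(2))

    return norm1(m) * norm1(m_inv)
```

For example, `condition_number(0.8, 0.7)` is much larger than `condition_number(0.8, 0.2)`, which reflects the statement that a cut with nearly equal efficiency and rfficiency leads to an ill-conditioned problem.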

There are many different functions of $\epsilon$ and $r$ that can be offered as a statistic to weigh
the relative merit of one particular cut (with $\epsilon_1$ and $r_1$) versus another cut (with $\epsilon_2$ and $r_2$).
Minimizing $\gamma(M)$ of Equation 25 is only one strategy that can be used to search for a 'best'
set of cuts. One other strategy may be to maximize the rejection factor $R$ of Equation 10.
Another commonly encountered rule-of-thumb is to maximize $\epsilon \times R$; the extra factor of $\epsilon$ is
introduced to account for the fact that the amount of signal in the subset of $N_{pass}$ events is
directly proportional to $\epsilon$, cf. Equation 6a.

Figure 1 shows the behavior of these statistics for the cases $\epsilon = 0.8$ and $\epsilon = 0.2$. Each
is undefined for cases of $r > \epsilon$. Notice that strategies that rely on minimizing $\gamma(M)$ see the
relatively biggest improvement quickly as $r$ takes values away from $\epsilon$, but that a strategy of
maximizing the rejection factor sees the most improvement as $r$ approaches 0, independent of
the actual value of $\epsilon$. As expected, both methods favor values of $\epsilon$ closer to one than to zero.
None of these minimization or maximization strategies takes into account any uncertainties
in the state-of-knowledge of $\epsilon$ or $r$, so none of them can be considered an absolute statistic
in deciding between one cut or another.

FIG. 1. The behavior of $\gamma(M)$, $R$, and $\epsilon \times R$ as a function of the rfficiency $r$ for two different
values of efficiency $\epsilon$. Possible strategies for the preference of one cut over another are to
maximize the rejection factor $R$ or to minimize the condition number of the measurement matrix
$\gamma(M)$, see inserts.

While the linear algebra used to derive Equation 23 provides some insight into the
measurement problem, ultimately it is unsatisfying in several respects. The most obvious
limitation is that the upper limit on $\delta\mathbf{p}$ is not clearly defined in terms of confidence intervals.
Another drawback is that there are several possible choices for the norm of the measurement
matrix $M$ and the uncertainty $\delta M$. Yet another difficulty is the common confusion
that arises from attempts to assign uncertainties to a binomial measurement, such as what
fraction of events pass or fail a cut.

Bayesian probability theory provides a natural way to incorporate the knowledge,
including the uncertainties, of the efficiency, the rfficiency, and the measured experimental
results $f_{pass}$ into a coherent statement about the state of knowledge of the physical signal
fraction $f_{sig}$. Before describing the details of the solution to the measurement problem, the
basics of Bayesian probability theory will be reviewed by considering its application towards
binomial efficiencies.

III. A BAYESIAN DESCRIPTION OF BINOMIAL DATA



Bayesian probability theory (BPT) interprets a posterior probability density function 
(pdf) as the state-of-knowledge of an experimental result given some set of prior beliefs in 
the possible values of the result, the prior pdf, and the likelihood function describing the 
measured results of the experiment. The source of the posterior is Bayes' theorem: 

$$ \text{posterior pdf} = \frac{\text{prior pdf} \times \text{experimental likelihood}}{\text{evidence}} . \tag{26} $$

The maximum value of the posterior pdf is the most likely value, and the area beneath a 
particular interval along the posterior corresponds to the confidence that the true answer 
lies within the limits of that interval. 

A common problem encountered in the analysis of experimental results that is naturally
described by BPT is the characterization of the uncertainties associated with the fraction of
events that pass a particular cut, $f_{pass}$. It has long been known that binomially distributed
quantities can be approximated by the normal (Gaussian) distribution,

$$ P(f_{pass}; f_0, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}
   \exp\left( -\frac{(f_{pass} - f_0)^2}{2\sigma^2} \right) . \tag{27} $$

The most likely value of the normal pdf is $f_0 = N_{pass}/N_{Total}$, and the standard deviation is
$\sigma = \sqrt{f_0 (1 - f_0) / N_{Total}}$. This approximation is only valid when $N_{Total}$ is large and the
mean $f_0$ is not too close to the extreme values of one or zero. As $f_0$ approaches one or zero, the
width $\sigma$ (as defined) approaches zero. It is not uncommon in experimental physics that one
or both of these conditions is violated, leaving the approximation of Equation 27 unusable.
In particular, experimental results often present cases where zero events remain after a set
of cuts is applied to a data sample. Figure 2a shows normally distributed pdfs with the same
$f_0$ but with different sample sizes $N_{Total}$. Note that the tails of the Gaussian distribution
can extend beyond the physical region $0 \le f_{pass} \le 1$.

Several authors have used the following posterior pdf to describe $P(f_{pass})$ in the
interval $0 \le f_{pass} \le 1$:

$$ P(f_{pass}; N_{pass}, N_{Total}) = \frac{(N_{Total} + 1)!}{N_{pass}! \, (N_{Total} - N_{pass})!}
   \, (f_{pass})^{N_{pass}} \, (1 - f_{pass})^{N_{Total} - N_{pass}} . \tag{28} $$

The most likely value for this posterior is simply the fraction of events that pass the cut,
$f_0 = N_{pass}/N_{Total}$. Figure 2b shows Equation 28 for different values of $N_{Total}$, each with the
same most likely value of $f_0 = 0.2$. The origin of this posterior is the use of a uniform prior
over the physical region,

$$ P(f_{pass}) = \begin{cases} 1 & \text{if } 0 \le f_{pass} \le 1 , \\ 0 & \text{otherwise} . \end{cases} \tag{29} $$

The experimental likelihood is the binomial distribution,

$$ P(f_{pass}) = \frac{N_{Total}!}{N_{pass}! \, (N_{Total} - N_{pass})!}
   \, (f_{pass})^{N_{pass}} \, (1 - f_{pass})^{N_{Total} - N_{pass}} . \tag{30} $$

FIG. 2. Different posterior pdfs from sample sizes of 5, 10, 20, and 50 events are shown. Plot
(a) shows normally distributed pdfs; only points within the physical region are shown. Plot (b)
shows Bayesian posterior pdfs when a uniform distribution is used as the prior. The pdfs of (a)
and (b) each have a most likely value of $f_{pass} = 0.2$. Plot (c) shows Bayesian posterior pdfs when
Jeffreys' prior is used, where the fraction of events that pass the cut is 0.2 for each pdf. No part
of any pdf in (b) or (c) extends beyond the physical region $0 \le f_{pass} \le 1$.



The evidence, or marginalization term, normalizes the posterior to unit area, and is found
by integration:

$$ \text{evidence} = \int_{-\infty}^{+\infty} (\text{prior} \times \text{experimental likelihood}) \, df_{pass} . \tag{31} $$

In the case of the uniform prior, the evidence term for this experimental likelihood
distribution is equal to $(N_{Total} + 1)^{-1}$.
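
The value of the evidence term for the uniform prior can be verified numerically. The sketch below is a minimal illustration: the function names, the midpoint integration rule, and the sample of 10 events with 2 passing are assumptions chosen for the example.

```python
import math


def binomial_likelihood(f, n_pass, n_total):
    """Binomial likelihood of n_pass passing events out of n_total."""
    return (math.comb(n_total, n_pass)
            * f**n_pass * (1.0 - f)**(n_total - n_pass))


def uniform_prior_posterior(f, n_pass, n_total):
    """Likelihood divided by the evidence 1 / (n_total + 1)."""
    if not 0.0 <= f <= 1.0:
        return 0.0  # the uniform prior vanishes outside the physical region
    return (n_total + 1) * binomial_likelihood(f, n_pass, n_total)


def integrate(func, lo, hi, steps=2000):
    """Midpoint-rule integration of func between lo and hi."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


# Evidence for a hypothetical sample of 10 events with 2 passing the cut.
evidence = integrate(lambda f: binomial_likelihood(f, 2, 10), 0.0, 1.0)
```

The same function remains well defined when no events pass the cut, in which case the posterior peaks at zero.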

It is possible to construct a different posterior pdf $P(f_{pass})$ with a different choice of
prior, e.g.,

$$ P(f_{pass}; N_{pass}, N_{Total}) = \frac{N_{Total}!}{(N_{pass} - 1)! \, (N_{Total} - N_{pass})!}
   \, (f_{pass})^{N_{pass} - 1} \, (1 - f_{pass})^{N_{Total} - N_{pass}} \tag{32} $$

arises if Jeffreys' (divergent) prior is used:

$$ P(f_{pass}) = \begin{cases} 1 / f_{pass} & \text{if } 0 < f_{pass} \le 1 , \\ 0 & \text{otherwise} . \end{cases} \tag{33} $$

The evidence term in this case is equal to $(N_{pass})^{-1}$. The most likely value of Equation 32
is $f_0 = (N_{pass} - 1) / (N_{Total} - 1)$, which approximates the most likely values of Equations 27
and 28 for large sample sizes. Figure 2c shows the evolution of Equation 32 as the sample
size is increased while the fraction of events that pass the cut is held constant.
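
The normalization and the most likely value of the posterior obtained from the divergent $1/f_{pass}$ prior can be checked in the same way. The sketch below assumes that posterior has the form of a binomial likelihood multiplied by $1/f_{pass}$ and divided by the evidence $1/N_{pass}$; the function names and sample sizes are illustrative.

```python
import math


def divergent_prior_posterior(f, n_pass, n_total):
    """Posterior for the prior 1/f: n_pass * C(n_total, n_pass)
    * f^(n_pass - 1) * (1 - f)^(n_total - n_pass).  Needs n_pass >= 1."""
    if n_pass < 1:
        raise ValueError("the 1/f prior excludes samples with n_pass = 0")
    if not 0.0 < f <= 1.0:
        return 0.0
    coeff = n_pass * math.comb(n_total, n_pass)
    return coeff * f**(n_pass - 1) * (1.0 - f)**(n_total - n_pass)


def most_likely_value(n_pass, n_total):
    """Mode of the posterior above."""
    return (n_pass - 1) / (n_total - 1)


def integrate(func, lo, hi, steps=2000):
    """Midpoint-rule integration of func between lo and hi."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


# Area of the posterior for a hypothetical sample of 10 events, 2 passing.
area = integrate(lambda f: divergent_prior_posterior(f, 2, 10), 0.0, 1.0)
```

Comparing `most_likely_value(2, 10)` with `most_likely_value(20, 100)` shows the mode moving toward the observed pass fraction of 0.2 as the sample grows.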

The posterior pdf of Equation 32 is included for completeness and will not be used in
this solution to the measurement problem. Notice that the use of a divergent prior excludes
cases of $N_{pass} = 0$. Jeffreys' prior would be used in those cases when an experimentalist
claims complete ignorance of the efficiency of the cut in the absence of any surviving events;
i.e., if zero events pass the cut, the experimentalist who favors Jeffreys' prior will not claim
that any event will ever pass the cut. This is obviously not satisfactory when attempting
to set upper limits on a data sample with zero surviving events. In such a case, if the
experimentalist is comfortable with setting the most likely value of the posterior $P(f_{pass})$ at
$f_0 = 0$ when $N_{pass} = 0$, a flat prior should be used.

The three different posteriors, Equations 27, 28, and 32, demonstrate a feature of BPT:
as the sample size increases the posterior pdf becomes less sensitive to the particular choice
of the prior pdf. Furthermore, as long as the most likely value of the distribution is not
too close to its limiting values, as the sample size increases, the Gaussian distribution more
closely approximates the posterior pdfs of Equations 28 and 32. Figure 3a shows the three
different posteriors in the case of small sample size; Figure 3b shows the same distributions
from a twenty times larger sample.

FIG. 3. The three different posterior pdfs for a sample of events where 20% of the events pass
the cut. Plot (a) shows the posteriors for a sample size of five events. Plot (b) shows the posteriors
for a sample size of 100 events.

In this paper the notation $P(x)$ represents a pdf of no particular form. For complete
generality, the pdfs for the efficiencies $\epsilon$ and $r$ will be written as $P(\epsilon)$ and $P(r)$; it should be
assumed that for the remainder of this paper each is a shorthand representation for a binomial
posterior described by Equation 28. For other cases, the notation $P(x; x_{pass}, x_{Total})$ will
be used to describe a binomial posterior of the form given by Equation 28, while $P(x; x_0, \sigma_x)$
describes an explicitly Gaussian pdf of the form given by Equation 27.

Even though the posterior pdf represents the complete knowledge of the particular distribution
of possible values of a measured quantity, it is common to summarize the results of an
experiment, i.e. the posterior, with only a few numbers. Typically an experimental result will be
quoted as the most likely value (the mode of the posterior) with an upper and lower limit such
that the most likely value is contained within the limits at some confidence level. For multi-modal
posteriors it may be more appealing to quote the mean of the posterior rather than the
modal value. Binomial problems do not of themselves give rise to multi-modal posteriors.
For the purposes of this paper, the most likely value of a posterior will be quoted; the most
likely value of a posterior $P(x)$ will be represented by $x_0$. When error bars for a confidence level
$\alpha$ are quoted, they describe the shortest interval about the most likely value that contains
area $\alpha$ beneath the posterior pdf. Figure 4 shows the $\alpha = 0.683$ and $\alpha = 0.955$ confidence
intervals for a Bayesian posterior constructed by Equation 28.

FIG. 4. The Bayesian posterior pdfs (uniform prior) for a sample of 10 events of which 2 events
passed the cut. The hatched region of plot (a) represents the 0.683 confidence level interval about
the most likely value of $f_{pass} = 0.2$. The hatched region of plot (b) represents the 0.955 confidence
level interval.
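
The shortest interval containing a given area beneath a unimodal posterior can be approximated on a grid. The following sketch assumes the uniform-prior binomial posterior and a simple cell-ranking procedure; the function names, the grid size, and the procedure itself are illustrative choices rather than the method used to produce the figures.

```python
import math


def posterior(f, n_pass, n_total):
    """Uniform-prior binomial posterior on 0 <= f <= 1."""
    return ((n_total + 1) * math.comb(n_total, n_pass)
            * f**n_pass * (1.0 - f)**(n_total - n_pass))


def shortest_interval(n_pass, n_total, level, steps=20000):
    """Approximate the shortest interval holding `level` of the area.

    Grid cells are accumulated in order of decreasing density until the
    requested area is reached.  For a unimodal posterior the selected
    cells form a single interval that contains the mode.
    """
    h = 1.0 / steps
    cells = [((i + 0.5) * h, posterior((i + 0.5) * h, n_pass, n_total))
             for i in range(steps)]
    cells.sort(key=lambda cell: cell[1], reverse=True)
    area, lo, hi = 0.0, 1.0, 0.0
    for f, density in cells:
        area += density * h
        lo = min(lo, f - 0.5 * h)  # lower edge of the lowest selected cell
        hi = max(hi, f + 0.5 * h)  # upper edge of the highest selected cell
        if area >= level:
            break
    return lo, hi


# Hypothetical sample of 10 events of which 2 pass the cut.
lo68, hi68 = shortest_interval(2, 10, 0.683)
lo95, hi95 = shortest_interval(2, 10, 0.955)
```

Because the posterior vanishes outside the physical region, neither interval can extend below zero or above one, in contrast to error bars derived from the Gaussian approximation.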
IV. A SOLUTION FOR THE MEASUREMENT PROBLEM



In Section III it was shown that BPT can be used in binomial problems to construct
posterior pdfs that exist completely within the allowed physical region. The application of
cuts to finite data sets is a binomial problem: events in a data sample will either pass or
fail a particular cut. While this classification is completely natural during the course of an
experiment, the binomial quantity of interest is not the fraction of events which pass a cut,
but the fraction of signal events in the original sample. A useful theorem of BPT provides a
means of constructing a physical posterior pdf $P(f_{sig})$ from the experimental posterior pdf
$P(f_{pass})$.

Generally, if a variable $y$ is a function of variable $x$, $y = f(x)$, an existing posterior
distribution $P(y)$ can be used to construct a desired posterior function $P(x)$ simply by
replacing $y$ in $P(y)$ with the functional form of $x$ and multiplying this posterior by the
Jacobian:

$$ P(x) = P(y(x)) \times \left| \frac{dy}{dx} \right| . \tag{34} $$

In the measurement problem, this change of variables takes the form:

$$ P(f_{sig}) = P(f_{pass}(f_{sig})) \times \left| \frac{df_{pass}}{df_{sig}} \right| , \tag{35} $$

so that

$$ P(f_{sig}; \epsilon, r, N_{pass}, N_{Total}) = P((\epsilon - r) f_{sig} + r; N_{pass}, N_{Total}) \times |\epsilon - r| . \tag{36} $$

When Equations 28 and 17a are used, the posterior pdf that describes the amount of
signal in the original sample is:

$$ P(f_{sig}; \epsilon, r, N_{pass}, N_{Total}) =
   \frac{|\epsilon - r| \, (N_{Total} + 1)!}{N_{pass}! \, (N_{Total} - N_{pass})!}
   \times ((\epsilon - r) f_{sig} + r)^{N_{pass}}
   \, (1 - ((\epsilon - r) f_{sig} + r))^{N_{Total} - N_{pass}} . \tag{37} $$

The efficiency $\epsilon$ and the rfficiency $r$ should be considered nuisance parameters, as posterior
pdfs $P(\epsilon)$ and $P(r)$ can be constructed, according to Equation 28 in Section III, from
independent control samples. Once these posteriors are known, e.g. $P(\epsilon; \epsilon_{pass}, \epsilon_{Total})$, the
nuisance parameters can be integrated away:

$$ P(f_{sig}; N_{pass}, N_{Total}) = \int_0^1 d\epsilon \int_0^1 dr \,
   P(\epsilon) \, P(r) \, P(f_{sig}; \epsilon, r, N_{pass}, N_{Total}) . \tag{38} $$

Equation 38 is the solution to the measurement problem. The efficiency $\epsilon$, the rfficiency $r$,
the size of the sample $N_{Total}$, and the number of events which pass the cut $N_{pass}$ all contribute
to the posterior pdf for the fraction of the signal events in the original sample. This posterior
is completely Bayesian: it will provide a most likely value for the signal fraction; it also
allows for the natural construction of confidence intervals. Alternately, Equation 38 could
be written in terms of $P(f_{bkg})$, because of the trivial relationship $f_{sig} + f_{bkg} = 1$.
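
The marginalization over the efficiency and the rfficiency can be sketched numerically. The code below is a rough illustration, not a production implementation: it assumes uniform-prior binomial posteriors for both nuisance parameters, uses a coarse midpoint grid, and takes hypothetical control samples described by `(passed, total)` pairs.

```python
import math


def binomial_posterior(x, k, n):
    """Uniform-prior binomial posterior; zero outside 0 <= x <= 1."""
    if not 0.0 <= x <= 1.0:
        return 0.0
    return (n + 1) * math.comb(n, k) * x**k * (1.0 - x)**(n - k)


def signal_posterior(f_sig, eff, rff, n_pass, n_total):
    """Change of variables f_pass = (eff - rff) * f_sig + rff, including
    the Jacobian |eff - rff|, for fixed efficiency and rfficiency."""
    f_pass = (eff - rff) * f_sig + rff
    return abs(eff - rff) * binomial_posterior(f_pass, n_pass, n_total)


def marginal_signal_posterior(f_sig, n_pass, n_total,
                              eff_ctrl, rff_ctrl, steps=40):
    """Integrate the nuisance parameters away on a steps x steps grid.

    eff_ctrl and rff_ctrl are (passed, total) pairs for the two
    independent control samples (hypothetical inputs).
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        eff = (i + 0.5) * h
        p_eff = binomial_posterior(eff, *eff_ctrl)
        for j in range(steps):
            rff = (j + 0.5) * h
            p_rff = binomial_posterior(rff, *rff_ctrl)
            total += p_eff * p_rff * signal_posterior(
                f_sig, eff, rff, n_pass, n_total)
    return total * h * h


# Hypothetical data: 38 of 100 events pass; control samples of 50 events.
near_peak = marginal_signal_posterior(0.3, 38, 100, (40, 50), (10, 50))
far_tail = marginal_signal_posterior(0.9, 38, 100, (40, 50), (10, 50))
```

Re-running the last two lines with larger or smaller control samples gives a quick impression of how the width of the control-sample posteriors feeds into the signal-fraction posterior.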

This method of constructing $P(f_{sig}; N_{pass}, N_{Total})$ from Equation 38 allows the state-of-knowledge
of $\epsilon$ and $r$ to enter the understanding of $f_{sig}$ in a natural way. The efficiency and
rfficiency originate from independent control samples; their most likely values depend on the
particular cut used. It may be the case that $\epsilon$ and $r$ have modal values $\epsilon_0$, $r_0$ which lead to
a measurement matrix with a small condition number, q.v. Equation 25, but that they are
found from such small diagnostic samples that the posterior pdfs $P(\epsilon)$, $P(r)$ are very broad.
This will lead to a broader distribution for $P(f_{sig})$ than the case of very precisely known $\epsilon$
and $r$. Often experimentalists face the dilemma of diverting bandwidth from the recording
of possible signal sources to the task of increasing the size of control samples, especially when
suffering from limited statistics in one or more control samples. Equation 38 introduces an
easy way to evaluate control samples of different sizes. See Example 2 below.

It should be noted that even in cases where $\epsilon$ and $r$ are known very precisely, small
sample sizes may lead to a pdf $P(f_{pass})$ which has non-zero values outside the physical
region $r \le f_{pass} \le \epsilon$. In such cases, the integral of $P(f_{sig})$ over the physically allowed values
$0 \le f_{sig} \le 1$ will be less than unity. It is useful to define an overall confidence level for the
experiment,

$$ \alpha_{experiment} = \int_0^1 P(f_{sig}; N_{pass}, N_{Total}) \, df_{sig} . \tag{39} $$

The value $\alpha_{experiment}$ is the confidence that the observed values of $N_{pass}$ and $N_{Total}$ are
consistent with the knowledge $P(\epsilon)$ and $P(r)$. Equation 39 also defines the maximum
confidence level that can be quoted for the posterior pdf $P(f_{sig})$. Since $P(f_{sig})$ is restricted
to $0 \le f_{sig} \le 1$, $(1 - \alpha_{experiment})$ is the fraction of the posterior that could not be constructed
because part of $P(f_{sig})$ lies beyond the physical boundaries. Recall that by the construction
of Equation 38,

(40)

While the Bayesian posterior pdf $P(f_{pass})$ is normalized to unity, representing complete
certainty that the fraction of events which pass a cut is between zero and one, the posterior
pdf $P(f_{sig})$ is not normalized, except by the Jacobian as seen in Equation 34. A case
of $\alpha_{experiment}$ less than one simply implies that larger data and/or diagnostic samples are
required to increase the confidence that $f_{sig}$ is within the expected physical region. In cases
where either $\epsilon$ or $r$ is imprecisely known, the overall confidence of the experiment may
be small if the fraction of events which pass the cut is very close to either $\epsilon_0$ or $r_0$. In the
event of an overall experimental confidence level of much less than one, the experimentalist
is encouraged to alter the cut such that $f_{pass}$ is not too close to either $\epsilon$ or $r$, or to increase
the sample size $N_{Total}$.

Figure ^ shows the equivalence of Bayesian confidence level intervals and those constructed
by differences in the log of the posterior [5]. In cases of $\alpha_{experiment} = 1$, the posterior
is zero for all values outside the physical region: here the log-likelihood method will always
provide an interval completely within the physical region. In the case of $\alpha_{experiment} < 1$, the
log-likelihood method may not be able to set one or both limits within the physical region.
See the insert of Figure 5b for Example 1 below for an example of an experiment that could
not set bounds with a confidence level of greater than 93%.

It should be noted that if for some reason the posteriors $P(\epsilon)$ or $P(r)$ are constructed
by some formalism other than that of Section III which causes one or both of them to be
multimodal, the posterior $P(f_{sig})$ may become multimodal. This would be a very unlikely
circumstance that arises only through drastic intervention on the part of the experimenter.
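
The overall confidence level of an experiment can be illustrated for the simplest case of a precisely known efficiency and rfficiency, where it reduces to the area of the signal-fraction posterior over the physical region. The sketch below makes that simplifying assumption (no marginalization over the nuisance parameters) and uses hypothetical sample sizes; the function names are illustrative.

```python
import math


def binomial_posterior(x, k, n):
    """Uniform-prior binomial posterior; zero outside 0 <= x <= 1."""
    if not 0.0 <= x <= 1.0:
        return 0.0
    return (n + 1) * math.comb(n, k) * x**k * (1.0 - x)**(n - k)


def experiment_confidence(eff, rff, n_pass, n_total, steps=4000):
    """Area of P(f_sig) over 0 <= f_sig <= 1 for fixed eff and rff.

    By the change of variables this equals the area of P(f_pass)
    between rff and eff (midpoint rule on `steps` cells).
    """
    h = 1.0 / steps
    area = 0.0
    for i in range(steps):
        f_sig = (i + 0.5) * h
        f_pass = (eff - rff) * f_sig + rff
        area += abs(eff - rff) * binomial_posterior(f_pass, n_pass, n_total) * h
    return area


# Same pass fraction of about 0.4, hypothetical small and large samples.
alpha_small = experiment_confidence(0.8, 0.2, 2, 5)
alpha_large = experiment_confidence(0.8, 0.2, 38, 100)
```

The small sample leaves a visible part of the posterior outside the physical region, while the larger sample brings the confidence level close to one.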

This solution to the measurement problem presents the results as a fraction of signal
events $f_{sig}$ in the original sample of $N_{Total}$ events. It may be preferable for certain calculations,
such as cross section measurements, to find the total number of signal events $N_{sig}$. It
is trivial to use Equation 15a to perform the change of variables described by Equation 34:

$$ P(N_{sig}; N_{pass}, N_{Total}) = P\left( f_{sig} = \frac{N_{sig}}{N_{Total}}; N_{pass}, N_{Total} \right)
   \times \frac{1}{N_{Total}} . \tag{41} $$

The posterior $P(N_{sig})$ is defined on the interval $0 \le N_{sig} \le N_{Total}$.

Equation 38 is not only a useful method for interpreting the results of a single experiment,
but it can also be used as an unbiased tool to evaluate the possibility of applying different
sets of cuts to the same original data sample, see Example 1 below. This method has the
further ability to quickly judge the possible improvements in a result from increased sample
sizes, both for the data and control samples. Possible improvements may take the form of a
larger value of $\alpha_{experiment}$, a shorter confidence level interval about the most likely value of
$f_{sig}$, or both.


V. THE MEASUREMENT OF SIGNAL IN A SUBSAMPLE 



If an experiment is designed to extract a signal-rich sample of events by applying a
cut, it is unusual to quote the fraction of signal events in the original sample, which may be
dominated by background events. Rather than quoting the signal fraction from the $N_{Total}$
event sample, it may be more useful to quote either the signal fraction or the number of
signal events in the subsample of $N_{pass}$ events that pass the cut. Recall that the equation
for $S_{pass}$ is, from Equations 6a and 14a,

$$ S_{pass} = \frac{\epsilon}{\epsilon - r} \, (N_{pass} - r \, N_{Total})
            = \frac{\epsilon \, N_{pass}}{\epsilon - r} \left( 1 - \frac{r}{f_{pass}} \right) . \tag{42} $$

Rearranging the above, $f_{pass}$ can be expressed as a function of $S_{pass}$:

$$ f_{pass} = \frac{\epsilon \, r \, N_{pass}}{\epsilon \, N_{pass} - (\epsilon - r) \, S_{pass}} . \tag{43} $$

The number of signal events in the sample of events which passed the cut is restricted to the
interval $0 \le S_{pass} \le N_{pass}$. As in Section IV, a posterior describing some fraction of events
$g$ will be used so that $0 \le g \le 1$. The definition of $g$ for this problem is

$$ g = \frac{S_{pass}}{N_{pass}} . \tag{44} $$

Equation 43 is used to express $f_{pass}$ as a function of $g$:

$$ f_{pass} = \frac{\epsilon \, r}{\epsilon - (\epsilon - r) \, g} . \tag{45} $$

Just as it was possible to change variables in order to construct a posterior pdf $P(f_{sig})$ from
the form of the posterior for $P(f_{pass})$, it is possible to construct $P(g)$. The Jacobian from
Equation 45 is

$$ \left| \frac{df_{pass}}{dg} \right| = \left| \frac{\epsilon \, r \, (\epsilon - r)}{(\epsilon - (\epsilon - r) \, g)^2} \right| . \tag{46} $$

The posterior pdf $P(g)$ is then

$$ P(g; \epsilon, r, N_{pass}, N_{Total}) =
   P\left( \frac{\epsilon \, r}{\epsilon - (\epsilon - r) \, g}; N_{pass}, N_{Total} \right)
   \times \left| \frac{\epsilon \, r \, (\epsilon - r)}{(\epsilon - (\epsilon - r) \, g)^2} \right| . \tag{47} $$

As before, the nuisance parameters $\epsilon$ and $r$ should be integrated away:

$$ P(g; N_{pass}, N_{Total}) = \int_0^1 d\epsilon \int_0^1 dr \,
   P(\epsilon) \, P(r) \, P(g; \epsilon, r, N_{pass}, N_{Total}) . \tag{48} $$

Equation 48 can be converted into a posterior for $S_{pass}$ if a change of variables similar to
that of Equation 41 is performed, so

$$ P(S_{pass}; N_{pass}, N_{Total}) = \int_0^1 d\epsilon \int_0^1 dr \, (N_{pass})^{-1} \,
   P(\epsilon) \, P(r) \, P\left( g = \frac{S_{pass}}{N_{pass}}; \epsilon, r, N_{pass}, N_{Total} \right) . \tag{49} $$

The posterior of Equation 49 is defined for the interval $0 \le S_{pass} \le N_{pass}$. Similar posteriors
can be constructed to describe $S_{fail}$, $B_{pass}$, or $B_{fail}$.
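
The change of variables from the pass fraction to the signal fraction $g$ of the passing subsample can be checked numerically for a fixed efficiency and rfficiency. The sketch below assumes the mapping $f_{pass} = \epsilon r / (\epsilon - (\epsilon - r) g)$ and the uniform-prior binomial posterior; the function names and the sample of 5 events with 2 passing are hypothetical.

```python
import math


def binomial_posterior(x, k, n):
    """Uniform-prior binomial posterior; zero outside 0 <= x <= 1."""
    if not 0.0 <= x <= 1.0:
        return 0.0
    return (n + 1) * math.comb(n, k) * x**k * (1.0 - x)**(n - k)


def f_pass_of_g(g, eff, rff):
    """Pass fraction as a function of the signal fraction g of the
    passing subsample: g = 0 maps to rff and g = 1 maps to eff."""
    return eff * rff / (eff - (eff - rff) * g)


def subsample_posterior(g, eff, rff, n_pass, n_total):
    """P(g) = P(f_pass(g)) * |d f_pass / d g| for fixed eff and rff."""
    denom = eff - (eff - rff) * g
    jacobian = abs(eff * rff * (eff - rff)) / denom**2
    return binomial_posterior(eff * rff / denom, n_pass, n_total) * jacobian


def integrate(func, lo, hi, steps=4000):
    """Midpoint-rule integration of func between lo and hi."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


# Area of P(g) over 0 <= g <= 1 and of P(f_pass) between rff and eff.
area_g = integrate(lambda g: subsample_posterior(g, 0.8, 0.2, 2, 5), 0.0, 1.0)
area_f = integrate(lambda f: binomial_posterior(f, 2, 5), 0.2, 0.8)
```

The two areas agree, as they must for a change of variables: the physical region of $g$ corresponds to pass fractions between the rfficiency and the efficiency.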
Example 1: The discovery of the top quark

In 1995 the CDF [6] and D0 [7] collaborations reported conclusive evidence for the
process $p\bar{p} \to t\bar{t} + X$ at the Fermilab Tevatron. The key to this discovery was
the ability of each experiment to devise cuts for the efficient selection of $t\bar{t}$ signal events
from a parent sample of events with a high transverse momentum lepton (from the decay of
a $W$ boson) and three or more jets. This important discovery can be used to illustrate the
use of Equation 38.



The CDF experiment claimed discovery with 67 pb$^{-1}$ of integrated luminosity using
analyses based on two different methods of discriminating signal from background through 
the identification of b jets. The first method (SVX tag) identified b jets by the reconstruction 
of secondary vertices within a silicon vertex detector. The second method (SLT tag) involved 
the reconstruction of soft leptons (here, electrons or muons) from the semileptonic decay of 
b quarks. As expected these two methods had different efficiencies for both the signal and 
background, and different numbers of events passed each cut. 

Table I contains the relevant information as published by the CDF experiment. Also
included is the CDF result [8] using SVX tags from a Run 1 data set corresponding to 109
pb$^{-1}$ of integrated luminosity. Even though the SVX and SLT methods have very different
efficiencies, the different methods show good agreement in the fraction of signal events in
the original sample of 203 $W$ + 3 (or more) jets events. Notice the agreement between the
measured number of signal events $S_{\mathrm{pass}}$ from Equation 49 and the reported CDF numbers.



TABLE I. The discovery of the top quark at the CDF experiment: Shown are the results of
the solution to the measurement problem as applied to the published values of $\varepsilon$ and $r$. Compare
the solution's $S_{\mathrm{pass}}$ to the published CDF value. Also shown are the CDF results from a Run 1
data set corresponding to a larger integrated luminosity of 109 pb$^{-1}$.

Sample            N_Total  N_pass  ε (%)   r (%)      f_sig              S_pass             α_exp  CDF
SVX (67 pb^-1)    203      27      42 ± 5  3.3 ± 0.1  0.25 (+?, -0.07)   22.5 (+2.3, -1.9)  1.0    20.3 ± 2.1
SLT (67 pb^-1)    203      23      20 ± 2  7.6 ± 0.1  0.24 (+?, -0.18)   13.2 (+4.6, -6.0)  0.93   7.6 ± 2.0
SVX (109 pb^-1)   322      34      42 ± 5  3.3 ± 0.1  0.1? (+0.06, -?)   26.0 (+2.3, -2.7)  1.0    25.5 ± 1.7




Figure 5 shows the state-of-knowledge of the efficiencies for signal and background, as
well as the knowledge of the fraction of events which survive the different $b$-tag requirements.
In Figure 5a it is easy to see how the well-separated pdfs $P(r)$, $P(\varepsilon)$, and $P(f_{\mathrm{pass}})$ of the
SVX measurement give rise to a more precise measurement of $f_{\mathrm{sig}}$. The overlapping pdfs of
the SLT measurement shown in Figure 5b return a similar most-likely value of $f_{\mathrm{sig}}$, but the
overall precision of the SLT measurement is worse. The poor separation of the pdfs in the
SLT measurement leads to an overall experimental confidence level of 93%, compared to the
100% confidence of the SVX measurement. This can be seen graphically by comparing the
inserts of Figure 5. The SVX result is completely within the physical region $0 < f_{\mathrm{sig}} < 1$,
while 7% of $P(f_{\mathrm{sig}})$ for the SLT result lies outside the physical region.
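
The statement that part of $P(f_{\mathrm{sig}})$ lies outside the physical region can be illustrated with a rough estimate. The sketch below makes two assumptions that are not taken from the text: the posterior for $f_{\mathrm{pass}}$ is a Beta distribution (binomial likelihood, uniform prior), and $\varepsilon$ and $r$ are fixed at their central values. Because $f_{\mathrm{sig}}$ lies between 0 and 1 exactly when $f_{\mathrm{pass}}$ lies between $r$ and $\varepsilon$, the physical fraction is the integral of $P(f_{\mathrm{pass}})$ over that range. With the widths of $P(\varepsilon)$ and $P(r)$ ignored, the SLT number is not expected to reproduce the quoted 93%; only the ordering of the two samples is illustrated. The function names are hypothetical.

```python
import math

def beta_pdf(x, n_pass, n_total):
    """Assumed binomial posterior with a uniform prior: Beta(n_pass + 1, n_total - n_pass + 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    a, b = n_pass + 1, n_total - n_pass + 1
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1.0 - x))

def physical_fraction(eps, r, n_pass, n_total, steps=20000):
    """Integrate P(f_pass) from r to eps (midpoint rule), i.e. the part with 0 <= f_sig <= 1."""
    width = (eps - r) / steps
    total = sum(beta_pdf(r + (i + 0.5) * width, n_pass, n_total) for i in range(steps))
    return total * width

# Central values from Table I (fixed eps and r are an illustrative simplification).
alpha_svx = physical_fraction(eps=0.42, r=0.033, n_pass=27, n_total=203)
alpha_slt = physical_fraction(eps=0.20, r=0.076, n_pass=23, n_total=203)
print(alpha_svx, alpha_slt)
```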
FIG. 5. The probability density functions of interest for CDF's top quark discovery for the (a)
SVX-tag sample, and (b) SLT-tag sample. The inserts show the measured fraction of signal events
in the entire data sample; the hatched regions show the 0.683 CL interval about the most likely
value of $f_{\mathrm{sig}}$. Notice that the overlapping pdfs of (b) cause a less precise measurement of $f_{\mathrm{sig}}$.



Example 2: Using $W(\to e\nu)$ production as a luminosity monitor at Tevatron Run 2



It has been suggested |J that the two major Tevatron collider experiments count the
number of events from the process $p\bar{p} \to W^{\pm}(\to e^{\pm}\nu) + X$ during Tevatron Run 2 and use
the theoretical predictions for the cross section times branching ratio ($\sigma_{\mathrm{signal}}$) to measure
the integrated luminosity ($\mathcal{L}$). It is hoped that such a "$W$ counting" method can measure the
integrated luminosity recorded at each experiment more precisely than the approximately
4% precision [9], [10] used in the Run 1 physics results:

$$ \mathcal{L} = \frac{S_{\mathrm{pass}}}{\varepsilon \cdot A \cdot \sigma_{\mathrm{signal}}}. \qquad (50) $$

The variables $S_{\mathrm{pass}}$ and $\varepsilon$ have the same definitions as in Section |J; $A$ represents the kinematic
and geometric acceptance of the detector used to collect the signal events. It is more
natural to reformulate Equation 50 in terms of the total number of signal events recorded,

$$ \mathcal{L} = \frac{N_{\mathrm{sig}}}{A \cdot \sigma_{\mathrm{signal}}}, \qquad (51) $$

so that the measured fraction of signal events is a function of the integrated luminosity:

$$ f_{\mathrm{sig}} = \frac{N_{\mathrm{sig}}}{N_{\mathrm{Total}}} = \frac{\mathcal{L} \cdot A \cdot \sigma_{\mathrm{signal}}}{N_{\mathrm{Total}}}. \qquad (52) $$
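Following the derivation of the posterior for $f_{\mathrm{sig}}$, the posterior for the integrated luminosity is

$$ P(\mathcal{L}) = P\!\left(f_{\mathrm{sig}} = \frac{\mathcal{L} \cdot A \cdot \sigma_{\mathrm{signal}}}{N_{\mathrm{Total}}};\, N_{\mathrm{pass}}, N_{\mathrm{Total}}\right) \times \frac{A \cdot \sigma_{\mathrm{signal}}}{N_{\mathrm{Total}}}. \qquad (53) $$

The posterior $P(\mathcal{L})$ is defined on the interval $0 < \mathcal{L} < \mathcal{L}_{\mathrm{max}}$, where

$$ \mathcal{L}_{\mathrm{max}} = \frac{N_{\mathrm{Total}}}{A \cdot \sigma_{\mathrm{signal}}}. \qquad (54) $$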



The variables of Equation 53 are: the number of total events ($N_{\mathrm{Total}}$) prior to a selection
of signal events ($N_{\mathrm{pass}}$) by a cut with some efficiency ($\varepsilon$) and rfficiency ($r$); the acceptance of
the detector ($A$); and the theoretical cross section times branching fraction ($\sigma_{\mathrm{signal}}$). It will
be assumed that the acceptance of a given detector can be known to arbitrarily high
precision through the use of large Monte Carlo data sets and a complete detector simulation,
i.e.:

$$ P(A) = \begin{cases} 1 & \text{if } A = A_0 \\ 0 & \text{otherwise} \end{cases} \qquad (55) $$



The knowledge of the efficiency of the selection criteria, $P(\varepsilon)$, will come from $Z^0 \to e^+e^-$
decays recorded during data taking; for Tevatron Run 2 the size of this sample will be twenty
times the approximately five thousand such events recorded during Run 1 [10].

Ignoring the uncertainty in the integrated luminosity, the background fraction in the
sample of events which pass the cuts ($r \times f_{\mathrm{bkg}}$) was the dominant source of uncertainty in
the measured $p\bar{p} \to W^{\pm}(\to e^{\pm}\nu) + X$ cross sections from Run 1. It is impossible to predict
the exact amount of background that a given experiment will have prior to data collection,
so the most important experimental question facing an experiment that wishes to use a
process like inclusive $W^{\pm}$ production is: How much diagnostic data is needed to understand
the background to an arbitrary degree of accuracy?

In order to perform such a study, it is necessary to assume that each Tevatron experiment
will be delivered 2 fb$^{-1}$ of data, and that $\varepsilon$, $r$, and $A$ will be close to the currently reported
values from Run 1. It will be assumed that there is no uncertainty in the theoretical cross
section times branching ratio $\sigma_{\mathrm{signal}}$; the value of the reported D0 measurement [10] will be
used for $\sigma_{\mathrm{signal}}$. The size of the total data sample $N_{\mathrm{Total}}$ will be the sum of the number of
signal events ($N_{\mathrm{sig}}$) and an arbitrary number of background events ($N_{\mathrm{bkg}}$) depending on the
value of $r \times f_{\mathrm{bkg}}$:



$$ N_{\mathrm{bkg}} = \frac{(r \times f_{\mathrm{bkg}})}{1 - (r \times f_{\mathrm{bkg}})}\; N_{\mathrm{sig}}. \qquad (56) $$



Table II lists the assumed values for the Run 2 $W^{\pm}(\to e^{\pm}\nu)$ cross section measurement
for the D0 experiment. Table III shows the $1\sigma$ confidence limit in the measured integrated
luminosity that can be expected for a given experiment for different control sample sizes
($r_{\mathrm{Total}}$). The QCD background fraction in the inclusive $W$ sample is known to vary with
instantaneous luminosity and trigger definitions [[K]], [11]; Table III shows the effect of different
amounts of background on the precision of the luminosity measurement. Note that
the figures for the D0 experiment assume that both the central (CC) and endcap (EC)
calorimeters are used; if only the CC is used the D0 experiment can expect a more precise
measurement of integrated luminosity since the background fractions in the central region
should be approximately one-fifth the value in the EC regions, even though 70% of the
acceptance for $W \to e\nu$ events is in D0's central region.

TABLE II. The assumed values for the Tevatron Run 2 $p\bar{p} \to W^{\pm}(\to e^{\pm}\nu) + X$ cross section
measurement at D0. The measured value of the Run 1b inclusive $W$ boson production cross
section times branching ratio from the D0 experiment is used in place of a theoretical value for
$\sigma_{\mathrm{signal}}$. The value of $\varepsilon_{\mathrm{Total}}$ is an estimate of the total number of $Z^0 \to e^+e^-$ events that will be
available for the measurement of the efficiency.

Experiment   [?]    [?]    ε_Total    L          A       σ_signal
D0           0.43   0.70   108 000    2 fb^-1    0.465   2310 pb

It should be noted that the upgraded D0 central tracker to be used in Run 2 will almost
certainly have different values of $\varepsilon_0$ and $r_0$ than used here. Nevertheless, Table III shows that
even with moderately sized diagnostic samples (one-tenth the size of the final $Z^0 \to e^+e^-$
sample) it should be possible to measure the integrated luminosity to a precision of better
than 1% with this method, assuming that the theoretical uncertainties can be kept at or
below this level of precision.

TABLE III. Shown are the $1\sigma$ confidence level intervals about the nominal Tevatron Run 2
integrated luminosity $\mathcal{L}_0 = 2$ fb$^{-1}$, as a function of the amount of background and the number of
diagnostic events available to measure $P(r)$.

Experiment          r_0 × f_bkg   r_Total    1σ interval about L_0
D0 (nominal bkg)    0.064         1 000      ± 0.017 fb^-1
D0 (nominal bkg)    0.064         10 000     ± 0.012 fb^-1
D0 (nominal bkg)    0.064         100 000    ± 0.011 fb^-1
D0 (less bkg)       0.030         1 000      ± 0.013 fb^-1
D0 (less bkg)       0.030         10 000     ± 0.012 fb^-1
D0 (less bkg)       0.030         100 000    ± 0.011 fb^-1
D0 (more bkg)       0.100         1 000      ± 0.026 fb^-1
D0 (more bkg)       0.100         10 000     ± 0.012 fb^-1
D0 (more bkg)       0.100         100 000    ± 0.010 fb^-1
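
The arithmetic behind Equation 51 and the "better than 1%" statement can be checked with a few lines. This is a small sketch using only numbers from Tables II and III; the function and variable names are hypothetical, and the only added ingredient is the unit conversion 1 fb$^{-1}$ = 1000 pb$^{-1}$.

```python
PB_INV_PER_FB_INV = 1000.0  # 1 fb^-1 = 1000 pb^-1

def expected_signal_events(lumi_fb_inv, acceptance, sigma_pb):
    """Equation 51 solved for N_sig = L * A * sigma_signal."""
    return lumi_fb_inv * PB_INV_PER_FB_INV * acceptance * sigma_pb

def luminosity_fb_inv(n_sig, acceptance, sigma_pb):
    """Equation 51: L = N_sig / (A * sigma_signal), returned in fb^-1."""
    return n_sig / (acceptance * sigma_pb) / PB_INV_PER_FB_INV

# Table II inputs: L = 2 fb^-1, A = 0.465, sigma_signal = 2310 pb.
n_sig = expected_signal_events(2.0, 0.465, 2310.0)

# Table III, nominal background rows: half-width of the 1-sigma interval in fb^-1,
# keyed by the number of diagnostic events r_Total.
half_widths = {1000: 0.017, 10000: 0.012, 100000: 0.011}
relative_percent = {n: 100.0 * hw / 2.0 for n, hw in half_widths.items()}

print(n_sig)
print(relative_percent)
```

With these inputs roughly two million signal events are expected, and the interval for 10 000 diagnostic events corresponds to a relative precision of 0.6%.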
VI. THE CONFIDENCE LEVEL FOR THE POSSIBLE DISCOVERY OF A SIGNAL



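It may be the case that an experiment provides a sample of $N_{\mathrm{Total}}$ events, and the
expected number of events $\mu_B$ is modeled by some pdf, e.g. $P(\mu_B \mid B, \sigma_B)$, where the most
likely value $B$ is less than the observed number of events. In such a case, the experimentalist
may believe that the excess, $N_{\mathrm{Total}} - B$, is due to a real signal rather than a statistical
fluctuation. The confidence level $\alpha_{\mathrm{excess}}$ that the excess is due to something beyond the
expected background can be defined [12] as

$$ \alpha_{\mathrm{excess}} = 1 - \sum_{N=N_{\mathrm{Total}}}^{\infty} \int_0^{\infty} d\mu_B\, \frac{e^{-\mu_B}\, \mu_B^{N}}{N!}\, P(\mu_B), \qquad (57) $$

or

$$ \alpha_{\mathrm{excess}} = \sum_{N=0}^{N_{\mathrm{Total}}-1} \int_0^{\infty} d\mu_B\, \frac{e^{-\mu_B}\, \mu_B^{N}}{N!}\, P(\mu_B). \qquad (58) $$

In the case of an excess, it should be assumed that $P(\mu_B) = 0$ for all $\mu_B > N_{\mathrm{Total}}$, so
Equation 58 can be written

$$ \alpha_{\mathrm{excess}} = \sum_{N=0}^{N_{\mathrm{Total}}-1} \int_0^{N_{\mathrm{Total}}} d\mu_B\, \frac{e^{-\mu_B}\, \mu_B^{N}}{N!}\, P(\mu_B). \qquad (59) $$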

It is natural to try to enhance the significance of a possible signal by reducing the
expected background $\mu_B$. This is done by applying a cut on the sample of $N_{\mathrm{Total}}$ events. A
cut is usually chosen such that the rfficiency of the cut on the background events (here, the
anticipated events described by $P(\mu_B)$) is small, while the expected efficiency of the cut on
the possible signal is large. Equation 14a can be turned around:

$$ \varepsilon \cdot N_{\mathrm{sig}} = N_{\mathrm{pass}} - r \cdot N_{\mathrm{bkg}}. \qquad (60) $$

In order to claim a positive signal, there are two conditions. The trivial condition is that
the efficiency $\varepsilon$ is non-zero. The more important condition is that $N_{\mathrm{pass}} > r \cdot N_{\mathrm{bkg}}$. If there
is an excess, the significance of the excess in the sample of $N_{\mathrm{pass}}$ events incorporates the
knowledge of $r$ and $\mu_B$ through the pdfs $P(r)$ and $P(\mu_B)$:

$$ \alpha_{\mathrm{discovery}} = \sum_{N=0}^{N_{\mathrm{pass}}-1} \int_0^1 dr \int_0^{N_{\mathrm{Total}}} d\mu_B\, \frac{e^{-(r \cdot \mu_B)}\, (r \cdot \mu_B)^{N}}{N!}\, P(\mu_B)\, P(r). \qquad (61) $$

$N^{\mathrm{disc}}_{\mathrm{pass}}$ can be defined as the minimum number of events $N_{\mathrm{pass}}$ from Equation 61 that
guarantees a confidence level $\alpha_{\mathrm{discovery}}$ in the signal. Just as in Equation 38, the knowledge
of the rfficiency $P(r)$ plays a critical part in the solution to this problem. The attempt
to enhance a possible signal is a common exercise that is often fraught with difficulties;
Equation 61 is an unbiased tool to estimate the minimum number of events which must
survive any new cut used to extract a possible signal.
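
A numerical reading of Equation 61 can be sketched as follows. The pdfs used here are assumptions made for illustration, not choices confirmed by the text: $P(\mu_B)$ is taken as a Gaussian with mean $B$ and width $\sigma_B$, truncated to $0 < \mu_B < N_{\mathrm{Total}}$, and $P(r)$ as the Beta posterior implied by $r_{\mathrm{pass}}$ successes in $r_{\mathrm{Total}}$ trials with a uniform prior. Both integrals are done with a midpoint rule on a fixed grid, and the function names are hypothetical. The inputs are those of Example 3 (Table V).

```python
import math

def beta_pdf(x, k, n):
    """Assumed posterior for r: Beta(k + 1, n - k + 1) from k passes in n trials."""
    a, b = k + 1, n - k + 1
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1.0 - x))

def alpha_discovery(n_pass, n_total, b, sigma_b, r_pass, r_total, steps=150):
    """Equation 61 on a midpoint grid, with both pdfs normalized on the grid."""
    mus = [(i + 0.5) * n_total / steps for i in range(steps)]   # 0 < mu_B < N_Total
    rs = [(j + 0.5) / steps for j in range(steps)]              # 0 < r < 1
    w_mu = [math.exp(-0.5 * ((m - b) / sigma_b) ** 2) for m in mus]
    w_r = [beta_pdf(r, r_pass, r_total) for r in rs]
    norm = sum(w_mu) * sum(w_r)
    total = 0.0
    for m, wm in zip(mus, w_mu):
        for r, wr in zip(rs, w_r):
            lam = r * m
            term = math.exp(-lam)       # Poisson probability of N = 0
            cdf = 0.0
            for n in range(n_pass):     # sum over N = 0 .. n_pass - 1
                cdf += term
                term *= lam / (n + 1)
            total += wm * wr * cdf
    return total / norm

def min_n_pass(target, n_total, b, sigma_b, r_pass, r_total):
    """Smallest N_pass whose alpha_discovery reaches the target confidence level."""
    for n_pass in range(1, n_total + 1):
        if alpha_discovery(n_pass, n_total, b, sigma_b, r_pass, r_total) >= target:
            return n_pass
    return None

# Example 3 inputs: N_Total = 36, B = 30, sigma_B = 5, r_pass / r_Total = 10 / 100.
n_683 = min_n_pass(0.683, 36, 30.0, 5.0, 10, 100)
n_955 = min_n_pass(0.955, 36, 30.0, 5.0, 10, 100)
print(n_683, n_955)
```

The results can be compared with the entries of Table V; exact agreement is not guaranteed, since the forms of $P(\mu_B)$ and $P(r)$ used here are assumptions.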
Example 3: Discovery of a Higgs boson at CDF?

In 1997 the CDF experiment [13] reported a slight excess of events in the process
$p\bar{p} \to W^{\pm}$ + 2 jets, in events where the $W^{\pm}$ boson decayed to either an electron or muon,
and one of the jets in the event was identified as coming from a $b$ quark decay by either the
SVX or SLT tagging method. The CDF data are summarized in Table IV.

TABLE IV. The CDF results reporting a possible excess of observed events in the $W^{\pm}$ + 2 jets sample
where one or both of the jets has been $b$-tagged. The significance of the excess ($\alpha_{\mathrm{excess}}$) has been
calculated using a Gaussian distribution with mean $B$ and width $\sigma_B$.

Sample                       Number Observed   Background Estimate (B ± σ_B)   α_excess
W± + 2 jets (no tag)         1527
W± + 2 jets (one tag)        36                30 ± 5                          0.49
W± + 2 jets (both tagged)    6                 3.0 ± 0.6                       0.90



An interesting but as yet unobserved Standard Model process that has the experimental
signature of a $W^{\pm}$ boson and two $b$-jets is associated production of a $W^{\pm}$ boson and a Higgs
boson [13], i.e. $p\bar{p} \to W^{\pm} + H^0$. It is reasonable to try to enhance the 49% confidence level
excess in the CDF single-tag sample by requiring both jets to be tagged; Table IV shows
that this requirement increases the confidence that the excess is due to a signal beyond
expectations to 90%. Based on the CDF results, it is possible to estimate how many events
would have to pass the double-tag requirement in order to claim a higher significance. From the
data shown in Table IV it can be assumed that the rfficiency of the double-tag requirement
is 10%. Table V shows the minimum number of events from the single-tag sample that
would have to pass the double-tag requirement in order to claim excesses at the $2\sigma$ and $3\sigma$
levels. Also shown is the case where only 5 events pass the double-tag requirement, which
has an $\alpha_{\mathrm{excess}}$ confidence level of only $1\sigma$.



TABLE V. Shown are the minimum number of events $N^{\mathrm{disc}}_{\mathrm{pass}}$ that must pass a cut in order to
claim discovery of a signal at a given confidence level $\alpha_{\mathrm{discovery}}$, for Example 3.

N_Total   B     σ_B   r_pass   r_Total   α_discovery   N_pass^disc
36        30    5     10       100       0.683         5
36        30    5     10       100       0.955         8
36        30    5     10       100       0.997         11






VII. CROSSCHECKS FOR THE POSSIBLE DISCOVERY OF A SIGNAL 



If after the application of the cut a significant excess is observed, it is straightforward to
use the measurement problem as a cross check of the possible discovery. In order to perform
such a cross check, some knowledge $P(\varepsilon)$ of the efficiency $\varepsilon$ of the cut on the (possible)
signal is necessary. If nothing is known about $\varepsilon$ except that it was large enough to provide
discovery, i.e. $\varepsilon > r$, the uniform prior pdf to be used for $P(\varepsilon)$ is

$$ P(\varepsilon) = \begin{cases} 0 & \text{when } \varepsilon < r \\ (1 - r)^{-1} & \text{when } \varepsilon > r \end{cases} \qquad (62) $$

Otherwise, if there is some a priori knowledge assumed about the possible signal, there
may be more informed knowledge of $\varepsilon$, perhaps from a Monte Carlo simulation. With the
knowledge of the efficiency and the rfficiency for the cut, it is reasonable to assess the overall
confidence that fewer than $N_{\mathrm{fail}}$ events will fail the cut, where $N_{\mathrm{fail}}$ is the number of events
which are removed from the sample by a cut designed to enhance the possible signal. The
expected number of events that will fail the cut, $\mu_{\mathrm{fail}}$, is

$$ \mu_{\mathrm{fail}} = (1 - \varepsilon)\, N_{\mathrm{Total}} + (\varepsilon - r)\, \mu_B. \qquad (63) $$
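
The form of Equation 63 can be recovered in one step. The derivation below is a sketch under an assumption not spelled out in the text: signal events fail the cut with probability $1 - \varepsilon$, background events fail with probability $1 - r$, and the signal content of the sample is taken to be $N_{\mathrm{Total}} - \mu_B$.

```latex
\begin{align}
\mu_{\mathrm{fail}}
  &= (1 - \varepsilon)\,(N_{\mathrm{Total}} - \mu_B) + (1 - r)\,\mu_B \\
  &= (1 - \varepsilon)\,N_{\mathrm{Total}} + \bigl[(1 - r) - (1 - \varepsilon)\bigr]\,\mu_B \\
  &= (1 - \varepsilon)\,N_{\mathrm{Total}} + (\varepsilon - r)\,\mu_B .
\end{align}
```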

Recall that for a possible discovery $\varepsilon > r$, so $\mu_{\mathrm{fail}}$ will never be negative. It is important
to recognize that from the substitution of $N_{\mathrm{pass}} = N_{\mathrm{Total}} - N_{\mathrm{fail}}$ into Equation 61, a small
value of $N_{\mathrm{fail}}$ which leads to a large value of $\alpha_{\mathrm{discovery}}$ may be inconsistent with the original
description of the expected background.

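The confidence level $\beta$ that fewer than $N_{\mathrm{fail}}$ events will fail the new cut given the expected
background $\mu_B$ is

$$ \beta = \sum_{N=0}^{N_{\mathrm{fail}}-1} \int_0^1 dr \int_0^{N_{\mathrm{Total}}} d\mu_B \int_0^1 d\varepsilon\; \frac{e^{-\mu_{\mathrm{fail}}}\, \mu_{\mathrm{fail}}^{N}}{N!}\, P(\varepsilon)\, P(\mu_B)\, P(r). \qquad (64) $$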

The variable $\beta$ is used here because in cases where the efficiency and rfficiency are known
very precisely, $\beta$ expresses the confidence that the initial description of the background
distribution can still accommodate the new excess. In other words, $\beta$ is a measure of how
likely it is for $N_{\mathrm{pass}}$ (or more) events to remain after the application of the new cut, given the
assumption that the original sample of events contains both a new signal and the expected
background.

Notice that once $N_{\mathrm{Total}}$ is fixed, the number of events $N^{\mathrm{disc}}_{\mathrm{pass}}$ necessary to claim a discovery
with a confidence level $\alpha_{\mathrm{discovery}}$ is a function of $r$ and $\mu_B$ only. Once a discovery is claimed,
the confidence level $\beta$ in the original background description depends not only on $r$ and $\mu_B$,
but also on $\varepsilon$. This reflects the fact that the efficiency $\varepsilon$ of a cut has no meaning for a sample
devoid of signal events. In the limiting case of Equation 64 where both $\varepsilon$ and $r$ approach
unity with complete certainty, $N_{\mathrm{fail}}$ approaches zero and $\beta$ approaches an upper limit of
1; if no further cut is placed on the sample, there is complete confidence that the original
background description accommodates the observed excess. A value of $\beta \approx 1$ implies that
$P(\mu_B)$ can completely accommodate an excess in $N_{\mathrm{pass}}$; $\beta \approx 0$ implies that the description
of the background cannot accommodate such an excess.
Example 4: The degree-of-belief for a Run 1 Higgs discovery

The CDF data used in Example 3 is a case where a double $b$-tag sample of $W^{\pm}$ +
2 jet events shows an excess of observed events over expectations at the 90% level, see
Table IV. For simplicity, and to ensure that the mathematical description of the number
of background events never has a value greater than the observed number of events, the
distribution of expected events $P(\mu_B)$ will be modeled here as

$$ P(\mu_B) = \begin{cases} (2\sigma_B)^{-1} & B - \sigma_B \le \mu_B \le B + \sigma_B \\ 0 & \text{otherwise} \end{cases} \qquad (65) $$

This model for the background lowers the significance of the excess in the double-tag sample
from 90% to 87%, but it will serve as an approximation to a Gaussian description of the
expected background. Three different cases will be considered for the possible new signal
that is causing the excess in the double-tag sample. The first possibility is that the double-tag
requirement has an efficiency of $\varepsilon_0 = 0.33$, which is the reported efficiency for a $W^{\pm} + H^0$
signal in the Run 1 CDF detector. The second possibility is that the efficiency is much
higher, $\varepsilon_0 = 0.90$. The final possibility is that there is no a priori knowledge about the
nature of this new signal: the knowledge of the efficiency of the cut on the new signal is only
that this efficiency is always greater than the rfficiency of the cut on the background, but
that it has no preferred value, q.v. Equation 62.
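
The cross check of Equation 64 can be sketched in a simplified form. In this illustration $\varepsilon$ and $r$ are fixed at single values (that is, $P(\varepsilon)$ and $P(r)$ are treated as delta functions, which is an assumption made here for brevity), while $P(\mu_B)$ is the uniform pdf of Equation 65. The function names are hypothetical, and because the widths of $P(\varepsilon)$ and $P(r)$ are ignored the numbers are not expected to reproduce Table VI. The sketch only shows the trend that a larger signal efficiency gives a larger $\beta$.

```python
import math

def poisson_prob_below(n_max, mean):
    """P(N < n_max) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)  # probability of N = 0
    total = 0.0
    for n in range(n_max):
        total += term
        term *= mean / (n + 1)
    return total

def mu_fail(eps, r, mu_b, n_total):
    """Equation 63: expected number of events that fail the cut."""
    return (1.0 - eps) * n_total + (eps - r) * mu_b

def beta_confidence(n_fail, eps, r, b, sigma_b, n_total, steps=200):
    """Equation 64 with fixed eps and r, averaged over the uniform P(mu_B) of Equation 65."""
    low = b - sigma_b
    width = 2.0 * sigma_b / steps
    acc = 0.0
    for i in range(steps):
        mu_b = low + (i + 0.5) * width
        acc += poisson_prob_below(n_fail, mu_fail(eps, r, mu_b, n_total))
    return acc / steps

# Observed CDF case: 36 events, 6 pass the double tag, so 30 fail; B = 30, sigma_B = 5, r = 0.10.
beta_33 = beta_confidence(n_fail=30, eps=0.33, r=0.10, b=30.0, sigma_b=5.0, n_total=36)
beta_90 = beta_confidence(n_fail=30, eps=0.90, r=0.10, b=30.0, sigma_b=5.0, n_total=36)
print(beta_33, beta_90)
```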

TABLE VI. Shown is the confidence $\beta$ in the original description of the background $P(\mu_B)$
assuming that $N^{\mathrm{disc}}_{\mathrm{pass}}$ events pass the cut with efficiency $P(\varepsilon)$. The first four rows assume a signal
efficiency of 33%; the second four rows assume an efficiency of 90%. In these cases $\varepsilon_{\mathrm{pass}}$ and $\varepsilon_{\mathrm{Total}}$
were chosen such that the width of $P(\varepsilon)$ is approximately 15%. In the final four rows a uniform
pdf for $P(\varepsilon)$ is assumed. The entries with $N^{\mathrm{disc}}_{\mathrm{pass}} = 6$ correspond to the observed CDF results.

α_discovery   N_pass^disc   ε_pass   ε_Total   α_experiment   N_fail   β
0.683         5             33       100       0.76           31       0.57
0.868         6             33       100       0.84           30       0.50
0.955         8             33       100       0.84           28       0.36
0.997         11            33       100       0.58           25       0.18
0.683         5             90       100       0.77           31       0.77
0.868         6             90       100       0.87           30       0.72
0.955         8             90       100       0.97           28       0.59
0.997         11            90       100       1.00           25       0.39
0.683         5             uniform P(ε)       0.70           31       0.64
0.868         6             uniform P(ε)       0.77           30       0.58
0.955         8             uniform P(ε)       0.82           28       0.45
0.997         11            uniform P(ε)       0.76           25       0.26




Table VI shows the results of the cross checks for the cases under consideration. The
first part of the table describes the case where the efficiency of the possible signal is the
value of the double-tag efficiency for $W^{\pm} + H^0$ production suggested by the CDF analysis.
The observed excess seen in the remaining 6 events has a statistical significance of 90%. The
likelihood that the original description of the background (30 ± 5 events) can accommodate a
possible signal with such an efficiency is 50%. Because any discovery must now include the
description of the signal efficiency $P(\varepsilon)$, there is a smaller overall confidence level in the new
result, here 84%, reflecting the fact that part of $P(f_{\mathrm{pass}})$ overlaps the pdfs $P(r)$ and $P(\varepsilon)$.

Figure 6a is a graphical representation of the cross check of the CDF result for the quoted
efficiency of $\varepsilon_0 = 33\%$. The areas of $P(f_{\mathrm{pass}})$ that overlap $P(r)$ and $P(\varepsilon)$ cause the value of
$\alpha_{\mathrm{experiment}}$ to be less than unity. Although the insert of Figure 6a shows some agreement between
the original background model $P(\mu_B)$ and $P(f_{\mathrm{bkg}})$, it is likely that the
original background description $P(\mu_B)$ is too narrow, given the experimental results.




FIG. 6. Shown are the relevant pdfs for the cross check of a possible $H^0 \to b\bar{b}$ discovery at
the CDF experiment in associated $W^{\pm} + H^0$ production. Plot (a) shows the fraction of the 36
events which pass the double $b$-tag requirement as a hatched histogram superimposed upon the
description of the efficiency for signal $P(\varepsilon)$ and background $P(r)$. The insert of (a) shows the pdf
$P(f_{\mathrm{bkg}})$, as a hatched histogram, that best describes the observed number of events $N_{\mathrm{pass}}$. This is
to be compared with the assumed description of the background $P(\mu_B)$. Plot (b) shows the results
for a hypothetical data set twenty times the size of the Run 1 result.

The rest of Table VI shows that it is more likely to see a larger number of signal events
if the efficiency of the cut on the new signal is larger, i.e. 90% instead of 33%. The last four
lines of Table VI show that even if no knowledge about the efficiency of the possible new
signal is claimed (except that the cut is more efficient for the signal than for the background),
the consistency $\beta$ of the initial background description with the observed 6
events is 58%, slightly higher than if the double-tag signal efficiency is described by a more
precise, but overall smaller, value of $\varepsilon_0 = 33\%$.

Table VI illustrates some important trends in attempts to increase the confidence level
of a possible signal by imposing a further cut. The most important result is that knowledge
of the cut efficiency for the possible signal is useful but not necessary when trying to claim a
discovery. For example, if the CDF collaboration chose to assume a uniform pdf $P(\varepsilon)$ for the
efficiency of the double-tag cut on any new signal, the confidence level $\alpha_{\mathrm{excess}}$ is still 77%.
As in all measurement problems, a higher efficiency is better than a lower efficiency. This
is important for increasing both the overall confidence level of the experiment $\alpha_{\mathrm{experiment}}$
and the confidence $\beta$ that the new cut preserves the signal in a manner consistent with the
original background estimate.

Example 5: Extrapolations for a Run 2 Higgs discovery 

The results of Examples 3 and 4 should not discourage anyone from looking for a similar
excess in a Tevatron Run 2 data set. It is trivial to scale the number of Run 1 events
($N_{\mathrm{Total}}$) from Example 3 by the expected increase in integrated luminosity, 2 fb$^{-1}$ for Run
2, and solve Equation 61 for the minimum number of events $N_{\mathrm{pass}}$ that must survive a
double-tag requirement in order to have a significant excess at some arbitrary confidence
level. Table VII has the results for a twenty times larger sample of events, assuming the
same knowledge $P(\mu_B)$ and $P(r)$ used in Examples 3 and 4.

TABLE VII. Shown are the minimum number of events $N^{\mathrm{disc}}_{\mathrm{pass}}$ that must pass the double-tag
requirement in order to claim discovery of a signal at a given confidence level $\alpha_{\mathrm{discovery}}$, for Example 5.

N_Total   B     σ_B   r_pass   r_Total   α_discovery   N_pass^disc
720       600   100   10       100       0.683         74
720       600   100   10       100       0.955         104
720       600   100   10       100       0.997         130



Just as was done with the real data in Example 4, it is possible to test the outcome of
the Run 2 experiment with different assumptions for the efficiency of the double-tag cut on
any possible signal. The results are collected in Table VIII. Figure 6b plots the results of
an experiment with an assumed efficiency $\varepsilon_0 = 33\%$ for the signal where 130 events survive
the double-tag cut, corresponding to a $3\sigma$ excess in the original sample.

The insert of Figure 6b shows good agreement between $P(f_{\mathrm{bkg}})$ and the broad $P(\mu_B)$
used in Example 4. This implies that given the present knowledge of the efficiencies and
the known processes, described by $P(\mu_B)$, that contribute events to the $W^{\pm} + b\bar{b}$ sample,
it would not be surprising to observe a $3\sigma$ excess in such a data sample from Run 2. The
ultimate interpretation of such an excess should come from an improved understanding of
$P(\varepsilon)$, $P(r)$ and $P(\mu_B)$.
TABLE VIII. Shown is the confidence $\beta$ in the original description of the background $P(\mu_B)$
assuming that $N^{\mathrm{disc}}_{\mathrm{pass}}$ events pass the cut with efficiency $P(\varepsilon)$.

α_discovery   N_pass^disc   ε_pass   ε_Total   α_experiment   [?]   β
0.683         74            33       100       0.46           31    0.81
0.955         104           33       100       0.86           28    0.51
0.997         130           33       100       0.9?           25    0.2?
0.683         74            90       100       0.46           31    0.97
0.955         104           90       100       0.86           28    0.88
0.997         130           90       100       0.98           25    0.76
0.683         74            uniform P(ε)       0.45           31    0.86
0.955         104           uniform P(ε)       0.81           28    0.66
0.997         130           uniform P(ε)       0.89           25    0.47




VIII. CONCLUSIONS 

The description of the measurement problem as a linear system of equations illuminates 
several important aspects of binomial experiments, including intuitive notions about the 
relative value of choosing cuts which preserve the signal of interest while rejecting non- 
interesting backgrounds. The use of Bayesian techniques in this solution of the binomial 
measurement problem offers a straightforward method of measuring signal and background 
fractions in both the total data sample and the subset of events which survive the application 
of a cut. It also provides an unbiased means of testing different cuts and is useful in 
evaluating potential improvements that come from increasing data samples. The method 
was also shown to be a powerful tool that can be used in the analysis of excess observed 
events over theoretical expectations. 






REFERENCES 



[1] B. Noble and J. W. Daniel, Applied Linear Algebra (Prentice-Hall, 1977), p. 170.
[2] A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Bayesian Data Analysis (Chapman & Hall, 1995), p. 31.
[3] G. D'Agostini, Probability and Measurement Uncertainty in Physics - a Bayesian Primer, hep-ph/9512295 (1995).
[4] D. S. Sivia, Data Analysis: A Bayesian Tutorial (Oxford University Press, 1996), p. 71.
[5] C. Caso et al. (Particle Data Group), The European Physical Journal C3 (1998).
[6] F. Abe et al. (CDF Collaboration), Phys. Rev. Lett. 74 2626 (1995).
[7] S. Abachi et al. (D0 Collaboration), Phys. Rev. Lett. 74 2632 (1995).
[8] F. Abe et al. (CDF Collaboration), Phys. Rev. D 59 092001 (1999).
[9] F. Abe et al. (CDF Collaboration), Phys. Rev. Lett. 76 3070 (1996).
[10] B. Abbott et al. (D0 Collaboration), submitted to Phys. Rev. D, Fermilab PUB-99/171-E, hep-ex/9906U23.
[11] B. Abbott et al. (D0 Collaboration), to be published in Phys. Rev. D, Fermilab PUB-99/015-E, hep-ex/9901040.
[12] O. Helene, Nucl. Instrum. Methods 212 319 (1983).
[13] F. Abe et al. (CDF Collaboration), Phys. Rev. Lett. 79 3819 (1997).
[14] A. Stange, W. Marciano, and S. Willenbrock, Phys. Rev. D 49 1354 (1994).






